Text File | 1987-09-25 | 27KB | 683 lines
CHAPTER 11 MACROS AND CONDITIONAL ASSEMBLY
Macro Facility
----- --------
A86 contains an easy-to-use but very powerful macro facility.
The facility subsumes the macro capabilities of most assemblers,
including operand concatenation, indefinite repeat (often called
IRP), and indefinite-repeat character (IRPC). Unlike other
assemblers, A86 integrates these functions into the main macro
facility, so they can be invoked without clumsy syntax or
strange characters in the macro-call operands.
Simple Macro Syntax
All macros must be defined before they are used. A macro
definition consists of the name of the macro, followed by the
word MACRO, followed by the text of the macro, followed by #EM,
which marks the end of the macro.
Many assembly languages require a list of dummy operand-names to
follow the word MACRO. A86 does not: the operands are denoted in
the text with the fixed names #1, #2, #3, ... up to a limit of
#9, for each operand in order. If there is anything following
the word MACRO, it is considered part of the macro text.
Examples:
; CLEAR sets the register-operand to zero.
CLEAR MACRO SUB #1,#1 #EM
CLEAR AX ; generates a SUB AX,AX instruction
CLEAR BX ; generates a SUB BX,BX instruction
; MOVM moves the second operand to the first operand. Both operands can be
; memory-variables.
MOVM MACRO
MOV AL,#2
MOV #1,AL
#EM
VAR1 DB ?
VAR2 DB ?
MOVM VAR1,VAR2 ; generates MOV AL,VAR2 followed by MOV VAR1,AL
Formatting in macro definitions and calls
The format of a macro definition is flexible. If the macro text
consists of a single instruction, the definition can be given in
a single line, as in the CLEAR macro given above. There is no
particular advantage to doing this, however: the assembler prunes
all unnecessary spaces, blank lines, and comments from the macro
text before entering the text into the symbol table. I recommend
the more spread-out format of the MOVM macro, for program
readability.
All special macro-operators within a macro definition begin with
a hash-sign # (a hex 23 byte). The letters following the hash-
sign can be given in either upper-case or lower-case. Hash-sign
operators are recognized even within quoted strings. If you wish
the hash-sign to be treated literally, and not as the start of a
special macro-operator, you must give 2 consecutive hash signs:
##. For example:
FOO MACRO
DB '##1'
DB '#1'
#em
FOO abc ; produces DB '#1' followed by DB 'abc'
The format of the macro call line is also flexible. A macro call
consists of the name of the macro, followed by the operands to be
plugged into the macro. The assembler prunes leading and
trailing blanks from the operands of a macro call. The operands
to a macro call are always separated by commas. Also, as in all
assembler source lines, a semi-colon occurring outside of a
quoted-string is the start of a comment, ignored by the
assembler. If you want to include commas, blanks, or semi-colons
in your operands, you must enclose your operand in single-quotes.
Macro operand substitution
Some macro assemblers expect the operands to macro calls to
follow the same syntax as the operands to instructions. In those
assemblers, the operands are parsed, and reduced to numeric
values before being plugged into the macro definition text. This
is called "passing by value". A86 does not pass by value; it
passes by text. The only parsing of operands done by the macro
processor is to determine the start and the finish of the operand
text. That text is substituted, without regard for its contents,
for the "#n" that appears in the macro definition. The text is
interpreted by the assembler only after the complete line has
been expanded, when that line is assembled.
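A short sketch of what passing by text implies. The macro names
DOUBLE and DOUBLE2 are made up for this illustration. Because the
operand text is pasted in unparsed, an expression operand can
combine with the surrounding macro text in a way that passing by
value would not allow:

```asm
; DOUBLE is meant to load twice the value of its operand into AX.
DOUBLE MACRO
MOV AX,#1*2
#EM
DOUBLE 5      ; expands to MOV AX,5*2
DOUBLE 3+4    ; expands to MOV AX,3+4*2
; DOUBLE2 parenthesizes the operand text within the macro.
DOUBLE2 MACRO
MOV AX,(#1)*2
#EM
DOUBLE2 3+4   ; expands to MOV AX,(3+4)*2
```

Assuming multiplication binds more tightly than addition, the
second DOUBLE call loads 11 rather than 14. The parentheses in
DOUBLE2 guard against that.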
If the first non-blank character after the macro name is a comma,
then the first operand is null: any occurrences of #1 in the
macro text will be deleted, and replaced with nothing. Likewise,
any two consecutive commas with no non-blanks between them will
result in the corresponding null operand. Also, out-of-range
operands are null; for example, #3 is a null operand if only two
operands are provided in the call.
Null operands to macros are not in themselves illegal. They will
produce errors only if the resulting macro expansion is illegal.
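The following sketch shows null operands that leave a legal
expansion behind. EXIT_PROC and COUNTER are hypothetical macros
invented for this illustration, and the expansions in the comments
are worked out from the rules above rather than taken from
assembler output:

```asm
; EXIT_PROC returns to the caller, optionally discarding stack bytes.
EXIT_PROC MACRO
RET #1
#EM
EXIT_PROC 4       ; generates RET 4
EXIT_PROC         ; #1 is out of range, hence null: generates RET
; COUNTER declares a word whose name is CNT_ followed by the first
; two operands run together; the third operand is the initial value.
COUNTER MACRO
CNT_#1#2 DW #3
#EM
COUNTER KEY,S,0   ; generates CNT_KEYS DW 0
COUNTER KEY,,0    ; #2 is null: generates CNT_KEY DW 0
```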
The method of passing by text allows operand-text to be plugged
anywhere into a macro, even within symbol names. For example:
; KF_ENTRY creates an entry in the KFUNCS table, consisting of a
; pointer to a KF_-action-routine. It also declares the
; corresponding CF_-symbol, which is the index within the table
; for that entry.
KF_ENTRY MACRO
CF_#1 EQU ($-KFUNCS)/2+080
DW KF_#1
#EM
KFUNCS:
KF_ENTRY UP
KF_ENTRY DOWN
; The above code is equivalent to:
;
; KFUNCS:
; DW KF_UP
; DW KF_DOWN
;
; CF_UP EQU 080
; CF_DOWN EQU 081
Quoted-string operands
As mentioned before, if you want to include blanks, commas, or
semicolons in your operands, you enclose the operand in single-
quotes. In the vast majority of cases in which these special
characters need to be part of operands, the user wants them to be
quoted in the final, assembled line also. Therefore, the quotes
are passed in the operand. To override this, and strip the
quotes from the string, you precede the quoted string with a
hash-sign. Examples:
DBW MACRO
DB #1
DW #2
#EM
DBW 'E', E_POINTER
DBW 'W', W_POINTER
; note that if quotes were not passed, the above lines would have
; to be DBW '''E''', E_POINTER; DBW '''W''', W_POINTER
GENERAL_PUSH MACRO
PUSH#1
#EM
GENERAL_PUSH F ; generates a PUSHF instruction
GENERAL_PUSH #' AX' ; generates a PUSH AX instruction
The fact that I could not come up with a more useful example than
GENERAL_PUSH is strong evidence that it is much better to pass
the quotes as the default action.
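A further sketch, using the hypothetical macros MSG and XFER, of an
operand that contains commas, blanks, and a semicolon. The
expansions in the comments follow from the quoting rules just
described:

```asm
; MSG stores a message string followed by a zero terminator.
MSG MACRO
DB #1,0
#EM
MSG 'Ready, set; go'   ; one operand: generates DB 'Ready, set; go',0
; XFER supplies both MOV operands through a single macro operand.
XFER MACRO
MOV #1
#EM
XFER #'AX, BX'         ; quotes stripped: generates MOV AX, BX
```

Without the quotes, the MSG call would be split at the comma into
two operands, and everything from the semicolon onward would be
taken as a comment.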
Looping by operands in macros
This macro facility contains two kinds of loops: you can loop
once for each operand in a range of operands; or you can loop
once for each character within an operand. The first kind of
loop, the R-loop, is discussed in this section; the second kind,
the C-loop, is discussed later.
An R-loop is a stretch of macro-definition code that is repeated
when the macro is expanded. In addition to the fixed operands #1
through #9, you can specify a variable operand, whose number
changes each time through the loop. You give the variable
operand one of the 4 names #W, #X, #Y, or #Z.
An R-loop begins with #R, followed immediately by the letter
W,X,Y, or Z naming the variable, followed by the number of the
first operand to be used, followed by the number of the last
operand to be used. After the #Rxnn is the text to be repeated.
The R-loop ends with #ER. For example:
STORE3 MACRO
MOV AX,#1
#RY24 ; "repeat for Y running from 2 through 4"
MOV #Y,AX
#ER
#EM
STORE3 VAR1,VAR2,VAR3,VAR4
; the above call produces the 4 instructions MOV AX,VAR1; MOV VAR2,AX;
; MOV VAR3,AX; MOV VAR4,AX.
The #L last operator and indefinite repeats
The macro facility recognizes the special operator #L, which is
the last operand in a macro call. #L can appear anywhere in
macro text, but it is most powerful in conjunction with R-loops,
where it yields an indefinite-repeat facility.
A common example is as follows: you can take any macro that is
designed for one operand, and easily convert it into a macro that
accepts any number of operands. You do this by placing the
command #RX1L, "repeat for X running from 1 through L", at the
start of the macro, and the command #ER at the end just before
the #EM. Finally, you replace all instances of #1 in the macro
with #X. We see how this works with the CLEAR macro:
CLEAR MACRO #RX1L
SUB #X,#X
#ER
#EM
CLEAR AX,BX ; generates both SUB AX,AX and SUB BX,BX in one macro!
It is possible for R-loops to iterate zero times. In this case,
the loop-text is skipped completely. For example, CLEAR without
any operands would produce no expanded text.
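Since #L can appear anywhere in macro text, it can also be used
outside of any loop. A minimal sketch, with LASTAX as a made-up
macro name:

```asm
; LASTAX loads AX from the final operand, however many are given.
LASTAX MACRO
MOV AX,#L
#EM
LASTAX VAR1             ; generates MOV AX,VAR1
LASTAX VAR1,VAR2,VAR3   ; generates MOV AX,VAR3
```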
Character-loops
We have seen the R-loop; now we discuss the other kind of loop in
macros, the character-loop, or C-loop. In the C-loop, the
variable W,X,Y, or Z does not represent an entire operand; it
represents a character within an operand.
You start a C-loop with #C, followed by one of the 4 letters
W,X,Y, or Z, followed by a single operand-specifier. Following
the #Cxn is the text of the C-loop. The C-loop ends with #EC.
The macro will loop once for every character in the operand.
That single character will be substituted for each instance of
the indicated variable-operand. For example:
PUSHC MACRO #CW1
PUSH #WX
#EC#EM
PUSHC ABC ; generates the 3 instructions PUSH AX; PUSH BX; PUSH CX
If the C-operand is quoted in the macro call, the quotes ARE
removed from the operand before passing characters to the loop.
It is not necessary to precede the quoted string with a hash-
sign in this case. If you do, the hash-sign will be passed as
the first character.
If the C-operand is a null operand (no characters in it), the
loop-text is skipped completely.
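The quoted and null cases can be sketched as follows. DBEACH is a
hypothetical macro, and the expansions in the comments are derived
from the rules above, not from assembler output:

```asm
; DBEACH emits a separate DB for each character of its operand.
DBEACH MACRO #CW1
DB '#W'
#EC#EM
DBEACH AB       ; generates DB 'A' followed by DB 'B'
DBEACH 'A,B'    ; quotes removed: generates DB 'A'; DB ','; DB 'B'
DBEACH          ; null operand: generates nothing
```

The quotes in the second call are needed only so that the comma is
taken as part of the operand instead of as a separator.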
The "B"-before and "A"-after operators
So far, we have seen that you can specify operands in your macro
in fourteen different ways: 1,2,3,4,5,6,7,8,9,W,X,Y,Z,L. We now
multiply these 14 possibilities, by introducing the "A" and "B"
operators. You can precede any of the 14 specifiers with "A" or
"B", to get the adjacent operand after or before the specified
operand. For example, BL means the operand just before the last
operand; in other words, the second-to-the-last operand. AZ
means the operand just after the Z operand. You can even repeat,
up to a limit of 4 "B"s or 3 "A"s: BBL is the third-to-last
operand; #AAA9 can be used where you would want to (but cannot)
use #12.
In the case of the variable operand to a C-loop, the "A" and "B"
specifiers denote the characters after or before the current
looping-character, respectively. An example of this is given in
the next section.
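For whole operands, a small sketch of the "B" operator applied to
#L. The macro name LAST2 is invented for this illustration:

```asm
; LAST2 loads AX and BX from the final two operands of the call.
LAST2 MACRO
MOV AX,#BL
MOV BX,#L
#EM
LAST2 VAR1,VAR2,VAR3,VAR4   ; generates MOV AX,VAR3 then MOV BX,VAR4
LAST2 VAR1,VAR2             ; generates MOV AX,VAR1 then MOV BX,VAR2
```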
Multiple-increments within loops
We have seen that you end an R-loop with a #ER, and you end a C-
loop with a #EC. We now present another way to end these loops;
a way that lets you specify a larger increment to the macro's
loop-counter. You can end your loops with one of the 4
additional commands #E1, #E2, #E3, or #E4.
For R-loops terminated by #ER, the variable-operand advances to
the next operand each time the loop repeats. If you end your R-loop
with #E2, the variable-operand advances 2 operands, not just one.
For #E3, it advances 3 operands; for #E4, 4 operands. The #E1
command is the same as #ER.
The most common usage of this feature is as follows: You will
recall that we generalized the CLEAR macro with an R-loop, so
that it would take an indefinite number of operands. Suppose we
want to do the same thing with the DBW macro. We would like DBW
to take any number of operands, and alternate DBs and DWs
indefinitely on the operands. This is made possible by creating
an R-loop terminated by #E2:
DBW MACRO #RX1L
DB #X
DW #AX
#E2
#EM
DBW 'E',E_POINTER, 'W',W_POINTER ; two pairs on same line!
The #E2 terminator means that we are looping on a pair of
operands. Note the crucial usage of the "A"-after operator to
specify the second operand of the operand-pair.
A special note applies to the DBW macro above: the assembler just
happens to accept a DW directive with no operands (it generates
no object code, and issues no error). This means that DBW will
accept an odd number of operands with no error, and do the
expected thing (it alternates bytes and words, ending with a
byte).
You could likewise generalize a macro with 3 or 4 operands, to an
indefinite number of triples or quadruples; by ending the R-loop
with #E3 or #E4. The operands in each group would be specified
by #X, #AX, #AAX, and, for #E4, #AAAX.
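As a sketch of the triple case, the hypothetical macro DBWW below
takes one byte value followed by two word values, any number of
times. The expansion in the comment is derived from the rules
above:

```asm
; DBWW alternates one DB with one two-operand DW, for each triple.
DBWW MACRO #RX1L
DB #X
DW #AX,#AAX
#E3
#EM
DBWW 1,VAR1,VAR2, 2,VAR3,VAR4
; generates DB 1; DW VAR1,VAR2; DB 2; DW VAR3,VAR4
```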
For C-loops terminated by #E1 through #E4, the character-pointer
is advanced the specified number of characters. You use this in
much the same way as for R-loops, to create loops on pairs,
triplets, and quadruplets of characters. For example:
PUSHC2 MACRO #CZ1
PUSH #Z#AZ
#E2
#EM
PUSHC2 AXBXSIDI ; generates PUSH AX; PUSH BX; PUSH SI; PUSH DI
Negative R-loops
We now introduce another form of R-loop, called the Q-loop-- the
negative repeat-loop. This loop is the same as the R-loop,
except that the operand number decrements instead of increments;
and the loop exits when the number falls below the finish-number,
not above it. The Q-loop is specified by #Qxnn instead of #Rxnn,
and #EQ instead of #ER. You can also use the multiple-decrement
forms #E1, #E2, #E3, or #E4 to terminate a Q-loop.
Example:
MOVN MACRO #QXL2 ; "negative-repeat X from L down to 2"
MOV #BX,#X
#EQ#EM
MOVN AX,BX,CX,DX ; generates the three instructions:
; MOV CX,DX
; MOV BX,CX
; MOV AX,BX
Note: the above functionality is already built into the MOV
instruction of the assembler. The macro shows how you would
implement it if you did not already have this facility.
Nesting of loops in macros
This macro facility allows nesting of loops within each other.
Since we provide the 4 identifiers W,X,Y,Z for the loop-operands,
you can nest to a level of 4 without restriction-- just use a
different letter for each nesting level. You can nest even
deeper, subject to the restriction that a letter W,X,Y,Z refers
to the innermost containing loop that defines it.
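A sketch of two nested R-loops, each with its own letter. CROSS is
a made-up macro, and the expansion shown is what the rules above
imply, assuming each #ER closes the innermost open loop:

```asm
; CROSS emits one word for each pairing of an operand from the
; first two with an operand from the last two.
CROSS MACRO #RX12
#RY34
DW #X+#Y
#ER
#ER
#EM
CROSS 1,2,10,20   ; generates DW 1+10; DW 1+20; DW 2+10; DW 2+20
```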
Implied closing of loops
If you have a loop or loops ending when the macro ends, and if
the loop increment for those loops is 1 (that is, they would end
with #ER, #EC, or #EQ rather than #E2 through #E4), you may omit
the #ER, #EC, or #EQ. The assembler closes all open loops when it
sees #EM, with no error.
For example, if you omit the #ER for the loop-version of the
CLEAR macro, it would make no difference-- the assembler
automatically places an #ER code into the macro definition for
you.
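Written out, the loop-version of CLEAR with its #ER omitted would
look like this:

```asm
CLEAR MACRO #RX1L
SUB #X,#X
#EM
CLEAR AX,BX   ; still generates SUB AX,AX and SUB BX,BX
```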
Local labels in macros
Some assemblers have a LOCAL pseudo-op that is used in
conjunction with macros. Symbols declared LOCAL to a macro have
unique (and bizarre) symbol-names substituted for them each time
the macro is called. This solves the problem of duplicate label
definitions when a macro is called more than once.
In A86, the problem is solved more elegantly, by having a class
of generic local labels throughout assembly, not just in macros.
Recall that symbols consisting of a single letter, followed by
one or more decimal digits, can be redefined. You can use such
labels in your macro definitions.
I have recommended that local labels outside of macros be
designated L1 through L9. Within macro definitions, I suggest
that you use labels M1 through M9. If you used an Ln-label
within a macro, you would have to make sure that you never call
the macro within the range of definition of another Ln-label with
the same name. By using Mn-labels, you avoid such potential
conflicts.
The following example of a local label within a macro is taken
from the source of the macro-processor itself:
; "JHASH label" checks to see if AL is a hash sign. If it is,
; it processes the hash-sign term, and jumps to label.
; Otherwise, it drops through to the following code.
JHASH MACRO
CMP AL,'##' ; is the scanned character a hash-sign?
JNE >M1 ; skip if not
CALL MDEF_HASH ; process the hash sign
JMP #1 ; jump to the label provided
M1:
#EM
...
L3: ; loop here to eat empty lines, leading blanks
CALL SKIP_BLANKS ; skip over the leading blanks of a line
INC SI ; advance source ptr beyond the next non-blank
JHASH L3 ; if hash-sign then process, and eat more blanks
CMP AL,0A ; were the blanks terminated by a linefeed?
JE L3 ; loop if yes, nothing on this line
L5: ; loop here after a line is seen to have contents
CMP AL,';' ; have we reached the start of a comment?
JE L1 ; jump if yes, to consume the comment
JHASH >L6 ; if hash-sign then process it; get next char
...
L6:
LODSB ; fetch the next definition-char from the source
CMP AL,' ' ; is it blank?
JA L5 ; loop if not, to process it
...
Debugging macro expansions
There is a tool called EXMAC which will help you troubleshoot
program lines that call macros. If you are not sure about what
code is being generated by your macro calls, EXMAC will tell you.
See Chapter 13 for details.
Conditional Assembly
----------- --------
A86 has a conditional assembly feature that allows you to
specify that blocks of source code will or will not be assembled,
according to the values of equated user symbols. The controlling
symbols can be declared in the program (and can thus be the
result of assembly-time expressions), or they can be declared in
the assembler invocation.
You should keep in mind the difference between conditional
assembly, invoked by #IF, and the structured-programming feature,
invoked by IF without the hash-sign. #IF tests a condition at
assembly-time, and can cause code to not be assembled and thus
not appear in the program. IF causes code to be assembled that
tests a condition at run-time, possibly jumping over code. The
skipped code will always appear in the program.
All conditional assembly lines are identified by a hash-sign #
as the first non-blank character of a line. The hash-sign is
followed by one of the four keywords IF, ELSEIF, ELSE or ENDIF.
#IF starts a conditional-assembly block. On the same line,
following the #IF, you provide a name. If the name is undefined,
or if it has been equated to zero, then the following lines of
code are skipped, up to the next matching #ELSEIF, #ELSE, or
#ENDIF. If the name is non-zero, then the following lines of
code are assembled normally. If a subsequent matching #ELSEIF or
#ELSE is encountered, then code is skipped up to the matching
#ENDIF.
#ELSEIF provides a multiple-choice facility for #IF-blocks. You
can give any number of #ELSEIFs between an #IF and its matching
#ENDIF. Each #ELSEIF has a name following it on the same line.
If the name following the #IF has zero value, then the assembler
looks for the first non-zero name following an #ELSEIF, and
assembles that block of code. If there are no non-zero #ELSEIFs,
then the #ELSE-block (if there is one) is assembled.
It is legal to provide an undefined name after #IF or #ELSEIF.
The name is interpreted as being false (zero), with no error.
You may precede the name in an #IF or #ELSEIF line with an
exclamation point "!", which acts as a NOT-operator: code will be
skipped if the name is non-zero instead of zero.
#ELSE marks the beginning of code to be assembled if all the
previous blocks of an #IF have been skipped over. There is no
operand after the #ELSE. There can be at most one #ELSE in an
#IF-block, and it must appear after any #ELSEIFs.
#ENDIF marks the end of an #IF-block. There is no operand after
#ENDIF.
It is legal to have nested #IF-blocks; that is, #IF-blocks that
are contained within other #IF-blocks. #ELSEIF, #ELSE, and
#ENDIF always refer to the innermost nested #IF-block.
As an example of conditional assembly, suppose that you have a
program that comes in three versions: one for Texas, one for
Oklahoma, and one for the rest of the nation. The three programs
differ in a limited number of places. Instead of keeping three
different versions of the source code, you can keep one version,
and use conditional assembly on the boolean variables TEXAS and
OKLAHOMA to control the assembler output. A sample block would
be:
#if TEXAS
DB 0,1,2,3
#elseif OKLAHOMA
DB 4,5,6,7
#else
DB 8,9,10,11
#endif
If a block of code is to be assembled only if TEXAS is false,
then you would use the exclamation-point operator:
#if !TEXAS
DB 0FF
#endif
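Nested #IF-blocks can be sketched the same way. In the block
below, DEBUG is a second control symbol assumed for illustration.
Each #else and #endif pairs with the innermost open #if:

```asm
#if TEXAS
#if DEBUG
DB 1     ; assembled when TEXAS and DEBUG are both non-zero
#else
DB 2     ; TEXAS non-zero; DEBUG zero or undefined
#endif
#else
DB 3     ; TEXAS zero or undefined, whatever DEBUG is
#endif
```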
Conditional Assembly and Macros
You may have conditional-assembly blocks either in macro-
definitions or in macro expansions. The only limitation is that
if you have an #IF-block in a macro expansion, the entire block
(i.e., the matching #ENDIF) must appear in the same macro
expansion. You cannot, for example, define a macro that is a
synonym for #IF.
To have your conditional-assembly block apply to the macro
definition, you provide the block normally within the definition.
For example:
X1 EQU 0
BAZ MACRO
#if X1
DB 010
#else
DB 011
#endif
#EM
BAZ
X1 EQU 1
BAZ
In the above sequence of code, the conditional-assembly block is
acted upon when the macro BAZ is defined. The macro therefore
consists of the single line DB 011, with all the conditional-
assembly lines removed from the definition. Thus, both
expansions of BAZ produce the object-code byte of 011, even
though the local label X1 has turned non-zero for the second
invocation.
To have your conditional-assembly block appear in the macro
expansion, you must literalize the hash-sign on each
conditional-assembly line by giving two hash-signs:
X1 EQU 0
BAZ MACRO
##if X1
DB 010
##else
DB 011
##endif
#EM
BAZ
X1 EQU 1
BAZ
Now the entire conditional-assembly block is stored in the macro
definition, and acted upon each time the macro is expanded.
Thus, the two invocations of BAZ will produce the different
object bytes 011 and 010, since X1 has become non-zero for the
second expansion.
You will usually want your conditional-assembly blocks to be
acted upon at macro-definition time, to save symbol-table space.
You will thus use the first form, with the single hash-signs.
Conditional Assembly and the XREF Program
In most cases, the XREF program will recognize conditional-
assembly blocks, and ignore skipped-code in its XREF compilation.
The last macro example above, however, is an example in which
XREF will not skip the same blocks that the assembler will;
because it falls under the following
WARNING: The XREF program will use the value of all symbols as it
existed at the end of assembly. XREF does not parse
statements that change the value of local variables! Thus, if
you have conditional assembly based on a variable whose value
changes during assembly, XREF will compile different source
than the assembler assembled.
The above warning does not apply to invocation-variables,
described below. If you wish to change the value of a
conditional-control variable during assembly, and if you wish
XREF to give accurate results, you should change the variable
between file-names in the invocation, as described below.
Declaring Variables in the Assembler Invocation
To facilitate the effective use of conditional assembly, this
assembler allows you to declare boolean (true-false) symbols in
the command-line that invokes the assembler. The declarations
can appear anywhere in the list of source file names. They are
distinguished from the file names by a leading equals-sign =. To
declare a symbol TRUE (value = 1), give the name after the
equals-sign. DO NOT put any spaces between the equals-sign and
the name! To declare a symbol FALSE (value = 0), you can give an
equals-sign, an exclamation-point, then the name. Again, DO NOT
embed any blanks! Example: if your source files are src1.8,
src2.8, and src3.8, then you can assemble with TEXAS true by
invoking the assembler as follows:
a86 =TEXAS src1.8 src2.8 src3.8
You can assemble with TEXAS explicitly set to FALSE as follows:
a86 =!TEXAS src1.8 src2.8 src3.8
Note that if TEXAS is used only as a conditional-assembly
control, then you do not need to include the =!TEXAS in the
invocation, because an undefined TEXAS will automatically be
interpreted as false.
Null Invocation Variable Names
The assembler will ignore an equals-sign by itself in the
invocation line, without error. This allows you to generate
assembler-invocation lines using parameters that could be either
boolean-variable-names, or null strings. For example, in the
previously-mentioned TEXAS-OKLAHOMA-nation example, the program
could be invoked via a .BAT file called "AMAKE.BAT", coded as
follows:
A86 =%1 *.8
You invoke the assembler by typing one of the following:
amake texas
amake oklahoma
amake
The third line will produce the assembler-invocation A86 = *.8;
causing no invocation-variables to be declared. Thus both TEXAS
and OKLAHOMA will be false, which is exactly what you want for
the rest-of-the-nation version of the program.
Changing Values of Invocation Variables
The usual prohibition against changing the value of a symbol that
is not a local-label does not apply to invocation-variables. For
example, suppose you have a conditional-control variable DEBUG,
which will generate diagnostic code for debugging when it is
true. Suppose further that you have already debugged source
files src1.8 and src3.8; but you are still working on src2.8. You
may invoke the assembler as follows:
A86 src1.8 =DEBUG src2.8 =!DEBUG src3.8
The variable DEBUG will be TRUE only during assembly of src2.8,
just as you want.
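Within the source files, the diagnostic code controlled by DEBUG
might be bracketed as in this sketch. DUMP_STATE is a hypothetical
routine name, not part of A86:

```asm
#if DEBUG
CALL DUMP_STATE   ; DUMP_STATE is a hypothetical diagnostic routine
#endif
```

With the invocation above, this call would be assembled only where
it occurs in src2.8.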